Abstract

In this paper the branching trees for attacking MILP are reviewed. 
Under certain circumstances branches can be done concurrently. This is 
fully investigated with the result that there are restrictions for certain 
dual values and reduced costs. As a side effect of this study a new class of
cuts for MILP is found, which are defined by those values.

1 Motivation of the following thoughts 

Nowadays the technique for doing MILP (Mixed Integer Linear Programming)
is based on the branch and bound method. This method uses the best solution
of the linear inequality system with objective function (= LP-instance) that
results from leaving out the integer conditions of the mixed integer linear
inequality system with objective function (= MILP-instance). Then this method
searches for a 0-1 (or integer) variable $x_n$ which has a non-integer value
$g_n$. The next step is to create two new LP-instances by adding first
$x_n = 0$ (or $x_n \le [g_n]$) and secondly $x_n = 1$ ($x_n \ge [g_n + 1]$).
By continuing this process a binary tree of problems is created.
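
The branching step just described can be sketched in a few lines. This is only an illustration: the names `branch` and `bounds` and the representation of a node as a dictionary of variable bounds are assumptions made here, not part of the paper.

```python
import math

def branch(bounds, n, g_n):
    """Split one LP node into two child nodes on variable n.

    bounds maps a variable index to its (lower, upper) bound pair
    (a hypothetical node representation chosen for this sketch).
    g_n is the non-integer LP value of variable n.
    """
    lo, hi = bounds[n]
    down = dict(bounds)
    up = dict(bounds)
    down[n] = (lo, math.floor(g_n))      # child with x_n <= [g_n]
    up[n] = (math.floor(g_n) + 1, hi)    # child with x_n >= [g_n + 1]
    return down, up

root = {0: (0, 1), 1: (0, 5)}
down, up = branch(root, 1, 2.4)
```

Each child copies the parent bounds, so repeated calls build the binary tree of problems while leaving the parent node unchanged.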

Now take two different nodes in this tree, so you look at two different
LP-instances. In both problems it is possible that some $x_n$ is still not
integer. We'll create a branch on that variable for both problems. It can
happen in a big and sparse MILP-instance that the same branching leads to
exactly the same calculations in the new LP-instances. From a numerical point
of view this is unsatisfactory.



New ideas have been developed here which use some kind of independence of
branching. These will help to prevent such double calculations. One further aim
of these new techniques is a better measurement and control of what happens
at a branching. A practical and short-term outcome should be better limits for
huge MILP-instances. It should be noticed that the prominent group of huge
Traveling Salesman Problems is a part of this group. As a matter of fact, this
group was indeed the starting point of the author's thoughts about this topic.

We'll show that the combination of branches can be described by an ordinary
linear inequality system, so that the problem of getting an optimal combination
of branches will be a LP-instance (luckily not a MILP-instance). We'll reach
this formulation in the middle of the second section at theorem 2.7. Instead of
a binary tree of depth $n$, which has $2^n$ problems, we want to use just $2n$
problems. We'll try to combine the solutions of the $2n$ problems as well as
possible to get a bound for the original problem (MILP-instance), which will be
better than the LP-bound but normally not as good as the bound obtained by
solving all $2^n$ problems.

We'll furthermore see that it is even possible to define a very huge LP for each 
MILP, which represents the ability to combine the several case differentiations. 

The main idea is not too difficult: 

We'll measure the differences of the dual variables and the gain of the objective 
function when creating new problems, which each has one inequality more than 
the starting LP-instance. These differences of the dual variables are naturally 
connected to the branches. Then we'll choose those differences of dual variables, 
so that for all combinations of choices at the connected branches, all dual in- 
equalities will hold for sure. By adding the gain of each chosen branching, we 
get a total gain, which gives a better limit of the original problem. 

It should be noted that the whole paper has been fully elaborated by the author.
In fact the only real reference is the basic facts about LPs as presented in [1].

2 Description of the technique in a very broad 
context 

2.1 Basic terminology and central theorem 

In the following we examine a problem $P$, which can be partially represented
as a minimal linear problem $P_l$. The description of the whole problem needs
some additional case differentiations. It should be remarked that the set of
MILP-instances is a proper subset of this problem class. The linear problem
$P_l$ has a set $\{y_m\}$ of inequalities in variables $\{x_n\}$. Without loss
of generality we assume that all $\{y_m\}$ are $\ge$-inequalities. We'll argue
later why equations may be excluded from the scope. Furthermore we expect that
the inequalities named by $y_m$ represent all bounds on the $x_n$.

Since the term branching has been used in LP-terminology in quite a lot of
places with an emphasis on really creating two problems out of one, a new
terminology will be introduced. We use the terms cases and files instead. A
case $C_{i,j}$ will stand for the evaluation of one possibility $j$ of a case
differentiation $i$; the sum of the cases $C_{i,j}$ makes together a file
$F_i$, which is in other words the case differentiation. But we'll soon define
it more concretely. We shall examine when and how the files can be combined to
get a higher lower limit for the optimal solution.

Therefore we'll start from the dual point of view, so $\omega$ will be
considered as the objective function of the dual problem. To ease the notation,
we state that the indices $\{m\}$ and $\{n\}$ have empty intersection. By
defining the index $\{r\}$ as the union of both we get something that makes all
formulations much easier.

The dual solution space of $P_l$ will be denoted by $V_0$, with optimal subspace
$L_0$ and optimal value $\omega_0$. Furthermore we choose an arbitrary
$y \in L_0$. Looking at one case $j$ of the case differentiation $i$, the dual
solution space is $V_{i,j}$. But we'll restrict this solution space by
$\{\omega_{i,j} \ge \omega_0\}$ to get a polytope $P_{i,j}$. Now let the case
$C_{i,j}$ be the following set: $\{c_{i,j} \mid y + c_{i,j} \in P_{i,j}\}$. So
it is a translation of $P_{i,j}$ that moves $y$ to $0$. Our objective function
$\omega_{i,j}$ can easily be extended to $C_{i,j}$ just by setting it to
$\omega_{i,j}(c_{i,j})$, which is the same as
$\omega_{i,j}(y + c_{i,j}) - \omega_{i,j}(y)$ due to the linearity of
$\omega_{i,j}$.

Definition 2.1 Now choose $c_{i,j} \in C_{i,j}$ and define
$\Delta^m_{i,j}(c_{i,j}) = y_m - (y + c_{i,j})_m = -(c_{i,j})_m$ as the change
of the $m$-th dual variable. Also define
$\Delta^n_{i,j}(c_{i,j}) = r_n(y) - r_n(y + c_{i,j})$, where $r_n$ shall be the
reduced cost of $x_n$. By this we also define $\Delta^r_{i,j}$.
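
Definition 2.1 amounts to two componentwise differences. The following minimal sketch assumes that the dual values and reduced costs of the start LP and of the case LP are already available as plain lists in the same order; the function name `case_deltas` and this data layout are assumptions for illustration only. The dual values in the usage lines are those of the trivial example in section 4.1, while the reduced costs are placeholder zeros.

```python
def case_deltas(y, r_y, y_case, r_case):
    """Return (delta_m, delta_n) for one case c_{i,j}.

    y, y_case   : dual values of the start LP and of the case LP
                  (y_case plays the role of y + c_{i,j}), same order m
    r_y, r_case : reduced costs r_n(y) and r_n(y + c_{i,j}), same order n
    """
    delta_m = [a - b for a, b in zip(y, y_case)]      # y_m - (y + c)_m
    delta_n = [a - b for a, b in zip(r_y, r_case)]    # r_n(y) - r_n(y + c)
    return delta_m, delta_n

# Dual values from section 4.1 (block 1, case j = 1); reduced costs are placeholders.
delta_m, delta_n = case_deltas([0.5, 0.5, 0.5], [0, 0, 0], [1, 0, 1], [0, 0, 0])
```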

It is important to state that the $\Delta^n_{i,j}(c_{i,j})$ can be calculated
from the $\Delta^m_{i,j}(c_{i,j})$ with additional info on the value of the new
dual variable(s) of $y + c_{i,j}$. If we have equalities as conditions, we'll
see that those dual values give no interesting $\Delta^m$-values, but the
$\Delta^n$-values can be calculated with the help of these values. We also
define the vector $l = (y, r_n(y))$; this vector has coordinates in $r$.
As $y + c_{i,j}$ is a dual solution, all coordinates $(y + c_{i,j})_m$ must be
positive or null. The same holds for the linear function $r_n$: the value
$r_n(y + c_{i,j})$ must be positive or null. Putting these facts together, we
get the following remark, which already has the structure of our main
statement 2.6.

Remark 2.2 For all $r$ the following holds:

$$\Delta^r_{i,j}(c_{i,j}) \le l_r$$

For later purposes we also investigate the linearity of the
$\Delta^m_{i,j}(c_{i,j})$ and the $\Delta^n_{i,j}(c_{i,j})$ and so of the
$\Delta^r_{i,j}(c_{i,j})$. We easily see that everything is linear:

Remark 2.3

$$\Delta^m_{i,j}(\lambda c_{i,j}) = \lambda \Delta^m_{i,j}(c_{i,j})$$

$$\Delta^m_{i,j}(c_{i,j} + d_{i,j}) = \Delta^m_{i,j}(c_{i,j}) + \Delta^m_{i,j}(d_{i,j})$$

$$\Delta^n_{i,j}(\lambda c_{i,j}) = \lambda \Delta^n_{i,j}(c_{i,j})$$

$$\Delta^n_{i,j}(c_{i,j} + d_{i,j}) = \Delta^n_{i,j}(c_{i,j}) + \Delta^n_{i,j}(d_{i,j})$$

In 2.2 we call those inequalities whose right hand side is greater than $0$ the
main inequalities.

Remark 2.4 If $c_{i,j}$ relates to an optimal solution of $P_{i,j}$, then one
main inequality is sharp.

To see this we assume that this is not the case. We consider
$y + (1 + \epsilon)c_{i,j}$, which has a higher objective value, and the
inequalities in 2.2 still hold. Those inequalities make the dual variables
related to the inequalities of $P_{i,j}$ positive and the dual inequalities of
$P_{i,j}$ true. It follows that $(1 + \epsilon)c_{i,j}$ is in $C_{i,j}$. This
is a contradiction to the optimality of $c_{i,j}$. So one of the main
inequalities must be sharp. The other non-main inequalities are in fact
trivial, since the values here for $y$ itself are already sharp, so the
$\Delta$-values must be negative or null.

The next step of our thoughts is to go from a case to the case differentiations,
which will be named files as announced. Let $F_i = \bigoplus_j C_{i,j}$ be a
file, and take an element $f_i = \bigoplus_j c_{i,j}$ out of our construct
$F_i$. We further define:

$$\omega_i(f_i) = \min_j(\omega_{i,j}(c_{i,j}))$$

$$\Delta^r_i(f_i) = \max_j(\Delta^r_{i,j}(c_{i,j}))$$

The deltas represent the highest differences of the changes within a case
differentiation (file) of the dual variables and the dual inequalities.
Based on 2.3 you easily get the following equalities and inequalities.

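Remark 2.5 For $\lambda \ge 0$:

$$\Delta^r_i(\lambda f_i) = \lambda \Delta^r_i(f_i)$$

$$\Delta^r_i(f_i + g_i) \le \Delta^r_i(f_i) + \Delta^r_i(g_i)$$

$$\omega_i(\lambda f_i) = \lambda \omega_i(f_i)$$

$$\omega_i(f_i + g_i) \ge \omega_i(f_i) + \omega_i(g_i)$$

So you can conclude that the deltas are still convex and $\omega_i$ is concave.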
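
The two file functions are a minimum over the gains and a componentwise maximum over the deltas. A minimal sketch follows, assuming each case is given as a pair of its gain and its list of delta values over all indices $r$; the name `file_values` and this representation are illustrative assumptions. The usage lines use the numbers of the trivial example in section 4.1.

```python
def file_values(cases):
    """Combine the cases of one file F_i.

    cases is a list of (omega, delta) pairs, one per case j, where delta
    holds the values Delta^r_{i,j}(c_{i,j}) for all indices r.
    Returns (omega_i, delta_i): minimum gain and componentwise maximum.
    """
    omega_i = min(omega for omega, _ in cases)
    delta_i = [max(column) for column in zip(*(delta for _, delta in cases))]
    return omega_i, delta_i

# Cases j = 1 and j = 2 of file i = 1 from section 4.1.
omega_1, delta_1 = file_values([
    (0.5, [-0.5, 0.5, -0.5, 0, 0, 0]),
    (0.5, [0.5, -0.5, 0.5, 0, 0, 0]),
])
```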

Now we even build a more complex space $\mathcal{P}$, which will be the sum of
all files and our final object. This space represents parallel files.

$$\mathcal{P} = \bigoplus_i F_i$$

with the following functions for $p \in \mathcal{P}$:

$$\Delta^r_{\mathcal{P}}(p) = \sum_i \Delta^r_i(f_i)$$

$$\omega_{\mathcal{P}}(p) = \omega_0 + \sum_i \omega_i(f_i)$$

Now let $p \in \mathcal{P}$, then we have chosen in all cases of all files a
solution vector. Remember that we are always talking about dual solutions and
variables. If we choose for each file $F_i$ a case $C_{i,j(i)}$, then we have a
new problem $P'$, which is in fact $P_l$ together with the inequalities from
all $C_{i,j(i)}$. For this we can calculate a solution $\bar{y}$ by means of
the $\Delta$: If we look at $\bar{y}_m$ then the value is:

$$\bar{y}_m = y_m - \sum_i \Delta^m_{i,j(i)}(c_{i,j(i)}) \ge y_m - \sum_i \Delta^m_i(f_i) = y_m - \Delta^m_{\mathcal{P}}(p)$$

So by $\Delta^m_{\mathcal{P}}(p) \le y_m$ we can make sure that the dual
variable $\bar{y}_m$ is positive or null. Since this is not important for
equations, we only considered inequalities before. Notice also that the
condition is independent of our choice $j(i)$.

For the validity of the $n$-th dual inequality we get something similar:

$$\Delta^n_{\mathcal{P}}(p) \le r_n(y) \;\Rightarrow\; \sum_i \Delta^n_{i,j(i)}(c_{i,j(i)}) \le r_n(y)$$

Keep in mind that also $r_n(\bar{y})$ can be calculated from the
$\Delta^m_{i,j(i)}(c_{i,j(i)})$ with the help of the new dual variables of all
chosen cases. This is the same way as the $\Delta^n_{i,j}(c_{i,j})$ could be
derived from the $\Delta^m_{i,j}(c_{i,j})$, as we have seen before.

The objective value of $\bar{y}$ is
$\omega_0 + \sum_i \omega_{i,j(i)}(c_{i,j(i)}) \ge \omega_{\mathcal{P}}(p)$.
Putting these thoughts together we get our central statement.

Theorem 2.6 (Central theorem) If $\Delta^r_{\mathcal{P}}(p) \le l_r$ holds for
all $r$, then the original problem $P$ must have an optimal value that is at
least $\omega_{\mathcal{P}}(p)$, which is at least the original $\omega_0$ of
the linear problem $P_l$.
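
The condition of the central theorem is easy to check once the file values are known. The sketch below assumes the files are given as pairs of gain and delta list and that `l` holds the stock $l_r$ for every index $r$; the function name `combined_bound` is hypothetical. The usage lines take the two files of the trivial example in section 4.1, where the LP bound is 3.

```python
def combined_bound(omega_0, files, l):
    """Return omega_0 + sum_i omega_i(f_i) if sum_i Delta^r_i(f_i) <= l_r
    holds for every r, otherwise None (the theorem then gives no statement)."""
    delta_p = [sum(column) for column in zip(*(delta for _, delta in files))]
    if any(used > stock for used, stock in zip(delta_p, l)):
        return None
    return omega_0 + sum(omega for omega, _ in files)

files = [
    (0.5, [0.5, 0.5, 0.5, 0, 0, 0]),   # file i = 1
    (0.5, [0, 0, 0, 0.5, 0.5, 0.5]),   # file i = 2
]
bound = combined_bound(3, files, [0.5] * 6)
```

With a smaller stock, for example `[0.4] * 6`, the same call returns `None`, because the files then use more of the dual values than is available.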

2.2 Building the little combining LP 

The last statement seems rather abstract, but by an easy trick we'll get two
different forms that can be used in an algorithm. To get the first we just
substitute $f_i$ by $\lambda_i f_i$ with $\lambda_i \ge 0$. As the deltas and
the objectives $\omega_i$ are linear in a scalar (2.5), we get the main result
of this article:

Theorem 2.7 (Central theorem - simple form) The ability to combine case
differentiations can be assured by the following inequalities:

$$\sum_i \lambda_i \Delta^r_i(f_i) \le l_r$$

The new lower limit is $\sum_i \lambda_i \omega_i(f_i) + \omega_0$. So by
solving this LP-instance in $\lambda_i$ we get a better lower limit for our
problem $P$.
By looking at all $P_{i,j}$ all values in this LP-instance can be calculated:
first the $(y + c_{i,j})_m$ and $r_n(y + c_{i,j})$, secondly the
$\Delta^m_{i,j}(c_{i,j})$, $\Delta^n_{i,j}(c_{i,j})$ and $\omega(c_{i,j})$, and
lastly the $\Delta^m_i(f_i)$, $\Delta^n_i(f_i)$ and the $\omega_i(f_i)$.

By the definition of $\omega_i$ it is natural to choose the $c_{i,j}$ in such a
way that for all $j$ the equation $\omega_i(f_i) = \omega_{i,j}(c_{i,j})$
holds. This can be achieved by substituting $c_{i,j}$ by
$\omega_i(f_i)\,\omega(c_{i,j})^{-1} c_{i,j}$. This is well-defined for our
purposes, because when $\omega(c_{i,j}) = 0$ we get no progress on the
objective function of $P$ from this case differentiation. By this substitution
in 2.7 the objective function $\omega_i(f_i)$ remains the same, but normally
the $\Delta$-values will decrease, leading to higher values when using the
practical form of the central theorem. We call this trick normalization.
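
Because the deltas are linear in $c_{i,j}$, normalization simply rescales the delta list of every case by the factor $\omega_i(f_i)/\omega(c_{i,j})$. The following sketch assumes cases given as pairs of gain and delta list; the name `normalize` is hypothetical, and the numbers are illustrative values modelled on the example in section 4.2, written as exact fractions.

```python
from fractions import Fraction as F

def normalize(cases):
    """Scale every case so that all cases have the gain omega_i = min_j omega_{i,j}.

    Returns None when the minimum gain is not positive, since such a file
    gives no progress on the objective.
    """
    omega_i = min(omega for omega, _ in cases)
    if omega_i <= 0:
        return None
    return [(omega_i, [omega_i / omega * d for d in delta])
            for omega, delta in cases]

cases = [
    (F(1, 6), [F(1, 3), F(-1, 6), F(-1, 6), F(1, 3), F(-1, 6)]),
    (F(2, 3), [F(-2, 3), F(1, 3), F(1, 3), F(-2, 3), F(1, 3)]),
]
normalized = normalize(cases)
```

In this instance the second case is scaled by 1/4, so its positive deltas shrink from 1/3 to 1/12, which is the decrease of the $\Delta$-values mentioned above.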

As a next step we want to generalize 2.7. We substitute in 2.6 $f_i$ by
$\sum_k \lambda_{i,k} f_i^k$ and use the convexity of $\Delta^r_i$ (2.5) to get
the second sum in the upcoming theorem 2.8.

Suppose furthermore that you have not only got one optimal solution of $P_l$,
but other solutions $y + c_0 \in L_0$. Such another solution can be found in a
case $C_0 = F_0$, which by itself already will be a file $F_0$. We have
furthermore a natural function $\Delta^r$ defined as above on this case. In the
argumentation for 2.6 we could have introduced this special case without any
problems. By this we get an extra term of $-c_0$ in the calculation of
$\bar{y}_m$. Since $y + c_0$ is an optimal solution of $P_l$, there will be no
quality growth directly related to $c_0$. So we don't have to define a function
$\omega$ for it.



Theorem 2.8 (Central theorem - more complex form) The ability to combine case
differentiations can be assured by the following inequalities:

$$\sum_k \lambda_{0,k} \Delta^r(c_0^k) + \sum_{i,k} \lambda_{i,k} \Delta^r_i(f_i^k) \le l_r$$

The new lower limit is $\sum_{i,k} \lambda_{i,k} \omega_i(f_i^k) + \omega_0$.
So by solving this LP-instance $S$ in $\lambda_{i,k}$ and $\lambda_{0,k}$ we
get a better lower limit for our problem $P$.

Although this formulation seems to be much stronger than the simple form 2.7,
this is not really the case. Looking at the second sum we see that the
$\Delta^r_i(f_i^k)$ can be created in 2.7 by choosing $F_{i_1} = F_{i_2}$ for
$i_1 \ne i_2$. This is possible because it was never stated that we made a case
differentiation only once.
But by the creation of 2.8 we see something different: If some
$\lambda_{i,k_1}$ and $\lambda_{i,k_2}$ with $k_1 \ne k_2$ are non-null for an
optimal solution of the resulting LP-instance in 2.8, then we can find better
values by setting
$f_i^{\mathrm{new}} = \lambda_{i,k_1} f_i^{k_1} + \lambda_{i,k_2} f_i^{k_2}$.
So by generating new columns we can sometimes improve the bound for $P$.
Also the first sum in 2.8 has some limitations. Suppose the following values:
$y_m = 1$, $(y + c_0)_m = 0$, $\Delta^m(f_{i_1}) = \Delta^m(f_{i_2}) = 1$. Then
the files for $i_1$ and $i_2$ cannot be combined fully. But if you exchange $y$
and $y + c_0$, they can, because then
$\Delta^m(f_{i_1}) = \Delta^m(f_{i_2}) = 0$ holds. So we should not expect the
first sum in 2.8 to help us very much for our needs.

We have presented in this section a theory on a problem which is described too
weakly by a LP. But in truth we studied the dual LP, where the problem is
described too sharply and can be weakened by case differentiations. As 2.7 can
be weakened by a natural case differentiation, fixing
$\Delta^r_i = \Delta^r_{i,j}$ for one $i$ and one $j$, we could use the whole
theory on it. This self-application is surprising and fascinating. We will
sketch one manual example later. Even if this looks interesting at first
glance, not much progress on the lower bound is expected from this iteration.

Although the mathematical formulation to combine case differentiations (files)
has been explained broadly in this section, some details are still not covered.
The problem is that those details might not be easy to attack at all. When
you think of a fast implementation of this idea you want to have an effective,
numerically stable and fast algorithm to find good elements
$c_{i,j} \in C_{i,j}$, where most $(c_{i,j})_m$ are zero. By these you get good
$f_i$ for a given solution $y$. We'll see later in section 3 that the normal
approach of using optimal solutions of the $P_{i,j}$ leads to problems in some
examples. So later in 5.2 and 5.3 we will attack these problems by using
non-optimal $y + c_{i,j}$ even before normalization, where most
$\Delta^m_{i,j}(c_{i,j})$ and $\Delta^n_{i,j}(c_{i,j})$ should be zero.

2.3 Building the huge combining LP 

In this section we will follow again the definitions and results made so far
and reach a mathematically satisfactory formulation of the theory.

We had started with one solution $y_0 \in V_0$. For each case of a file we also
have a $y_{i,j} \in V_{i,j}$. Notice that the reduced costs of one variable of
$y_0$ (and $y_{i,j}$) are by definition just a linear equation dependent on the
$(y_0)_m$ ($(y_{i,j})_m$). So we define as in 2.1 the $\Delta^r_{i,j}$ as
variables which are calculated by linear equations from $y_0$ and $y_{i,j}$;
the same holds for the $\omega_{i,j}$, which are also linearly dependent on
$y_0$ and $y_{i,j}$.

Via the restrictions $\Delta^r_i \ge \Delta^r_{i,j}$ and
$\omega_i \le \omega_{i,j}$ for all $j$ we have defined $\Delta^r_i$ and
$\omega_i$ by linear inequalities. Like before we define
$\Delta^r_{\mathcal{P}}$ as the sum of the $\Delta^r_i$ and
$\omega_{\mathcal{P}}$ as the sum of the $\omega_i$ plus the objective $\omega$
of $P_l$, which is also a linear term of $y_0$.
Via using the restrictions of 2.6 and setting the objective to
$\omega_{\mathcal{P}}$ we have now defined a very huge LP $P_{comb}$. We can
easily formulate a theorem which describes the problem of doing case
differentiations in parallel in a mathematically satisfactory way:

Theorem 2.9 (Central theorem - complete form) Each solution of $P_{comb}$
represents a lower bound of $P$.

The optimal value is the optimal lower bound possible via our combining
technique.

We refrained from writing all inequalities down explicitly, as the huge
number of indices for each variable might only be confusing and all
inequalities have already been described implicitly.

But it should be noticed that the definitions in 2.7, 2.8 and 6.1 are just
tightened and less complex inequality systems than this system.
Consider that $P$ has $n$ variables and $m$ restrictions and that all variables
are binaries, so that a simple case differentiation can be made on each
variable; then this new LP would have at least $2nm$ variables. For small sized
problems this might still be numerically possible to calculate.

In the author's view this huge LP should not be attacked for optimal values,
because of the unavoidable high computing time, but searched for good solutions
in an effective manner.

It should be mentioned that it is possible to use the theory again and to
construct a LP of dimension $4n^2m$ which should give better lower bounds than
$P_{comb}$.

3 Implementation with usage of optimal solu- 
tions of the subproblems 

Putting the thoughts from the previous section together you get the following
algorithm, described as pseudocode, to get higher objective values of a
MILP-instance with only 0-1 variables:



 1: derive $P_l$ from given MILP-instance
 2: load $P_l$ into LP-solver
 3: solve $P_l$ and save one optimal solution $x$ and the fitting dual solution $y$
 4: for all $i$, where $(x)_i$ is not integer do
 5:   case $j = 1$:
 6:     add inequality $x_i \le 0$ to $P_l$ (so getting $P_{i,1}$)
 7:     solve this LP by usage of the old dual solution $y$ and get $y + c_{i,1}$ as dual solution
 8:     calculate all $\Delta^r_{i,1}(c_{i,1})$ {as defined in 2.1}
 9:   case $j = 2$:
10:     add inequality $x_i \ge 1$ to $P_l$ (so getting $P_{i,2}$)
11:     solve this LP by usage of the old dual solution $y$ and get $y + c_{i,2}$ as dual solution
12:     calculate all $\Delta^r_{i,2}(c_{i,2})$ {as defined in 2.1}
13:   $\omega_i(f_i) = \min(\omega_{i,1}(c_{i,1}), \omega_{i,2}(c_{i,2}))$
14:   if $\omega_i(f_i) > 0$ then
15:     mark $i$
16:     if Normalization trick is wanted then
17:       if $\omega_{i,1}(c_{i,1}) < \omega_{i,2}(c_{i,2})$ then
18:         for all $r$: $\Delta^r_{i,2}(c_{i,2}) = \omega_{i,1}(c_{i,1}) (\omega_{i,2}(c_{i,2}))^{-1} \Delta^r_{i,2}(c_{i,2})$
19:       else
20:         for all $r$: $\Delta^r_{i,1}(c_{i,1}) = \omega_{i,2}(c_{i,2}) (\omega_{i,1}(c_{i,1}))^{-1} \Delta^r_{i,1}(c_{i,1})$
21:       end if
22:     end if
23:     for all $r$: $\Delta^r_i(f_i) = \max(\Delta^r_{i,1}(c_{i,1}), \Delta^r_{i,2}(c_{i,2}))$
24:   end if
25: end for
26: build new LP-instance $R$ with all marked $i$ as variables as described in 2.7
27: solve $R$
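
Lines 13 to 23 of the pseudocode do not need an LP solver and can be sketched on their own. The sketch below assumes that the two case LPs have already been solved and that each case is passed as a pair of its gain and its delta list; the function name `build_file` and the sample numbers are hypothetical.

```python
def build_file(case_1, case_2, normalization=True):
    """Lines 13-23 for one variable i with two cases (omega, delta).

    Returns None if i is not marked, otherwise (omega_i, delta_i).
    """
    (w1, d1), (w2, d2) = case_1, case_2
    w = min(w1, w2)                          # line 13
    if not w > 0:                            # line 14: no gain, do not mark i
        return None
    if normalization:                        # lines 16-22
        if w1 < w2:
            d2 = [w1 / w2 * v for v in d2]   # line 18
        else:
            d1 = [w2 / w1 * v for v in d1]   # line 20
    delta = [max(p, q) for p, q in zip(d1, d2)]   # line 23
    return w, delta

with_norm = build_file((1.0, [0.5, -0.5]), (2.0, [-1.0, 1.0]))
without_norm = build_file((1.0, [0.5, -0.5]), (2.0, [-1.0, 1.0]), normalization=False)
```

In this made-up instance normalization lowers the second delta from 1.0 to 0.5, so the file uses less of the stock in the combining LP $R$ while keeping the same gain.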

In the first implementation I was not able to use the old solution in lines 7
and 11 effectively. This has quite some impact because of the degeneracy of the
optimal dual solution in most of the prominent problems.

To understand this, suppose you have chosen an optimal $y$ with $(y)_m > 0$
for a problem $P$, but another optimal solution with $(y)_m = 0$ also exists.
Now for all $i$ one $c_{i,j}$ could exist where $(y + c_{i,j})_m = 0$. This
leads to the situation that no files can be combined, because of the inequality
of $R$ which ensures that dual variables for inequalities are positive. But if
you had chosen the other $y$, you would have fewer problems. As a side remark
it should be noticed that normally all $P_{i,j}$ then also have a degenerate
dual solution space.
Using the old solution $y$ as a starting point for solving $P_{i,j}$ has two
benefits: First the optimal solution should be found faster numerically, and
secondly, when dealing with degeneracy, the above described effect should
normally happen less often.

The above algorithm has been implemented with the open-source package glpk.
In this program all MILPs are transformed into minimum problems by exchanging
the sign of the objective. So some results on MILPs have an unusual sign. The
problem library MIPLIB2003 has been processed partially, giving the results in
the following table.

In the given table the column Branches measures the number of variables where
branching took place. The actual number of calculated LPs is twice that number
plus the initial LP and the combining LP. The column Degree is equal to the sum
$\sum_i \lambda_i$ of the optimal solution of the combining LP. It gives an
idea how many branches can be used at the same time but also in an effective
way. The current implementation already separates the normal and the dual LP
because of future plans, so we measure both in seconds. Furthermore the total
time for all calculations is also presented.

The given table only includes those instances where the program finished within
1 hour. Furthermore for some instances no advantage at all was gained, because
at no branch there was an increase in both nodes. For further investigations
these problems might be put out of scope. On the other hand for the instance
tr12-30 a quite high lower bound was reached: starting from 14210 the bound
79695 is reached, which is much nearer to the real value of 130596.
Instance           Pure LP  Bound Inc Branches  Degree Normal Dual Total
10teams             917.00       0.00      159    0.00      3    1    11
a1c1s1              997.53    1195.33      173   53.92      5    1    24
aflow30a            983.17      14.28       31    5.19      1          2
aflow40b           1005.66       7.16       38    1.80      3    2    17
air04             55535.44      84.61      292    1.00     59   21  1040
air05             25877.61      72.54      223    1.00     28    5   329
arki001         7579599.81     126.65       81    6.63      5    2    12
cap6000        -2451537.33       0.00        2    0.00     12    5    18
danoint              62.64       0.05       34    1.00           1     3
disctom           -5000.00       0.00      251    0.00     69        129
fiber            156082.52   15734.31       47    6.90      1          3
fixnet6            1200.88     210.51       60   21.33      1          2
gesa2          25476489.68   81043.25       58   35.91      2    1     5
gesa2-o        25476489.68   81891.56       73   36.70      1          4
glass4        800002400.00       0.00       72    0.00      1          1
harp2         -74353341.50       0.00       30    0.00      3    1     6
liu                 346.00     214.00      536    1.00      2    1    16
manna81          -13297.00       0.00      872    0.00     10    1    92
markshare1            0.00       0.00        6    0.00
markshare2            0.00       0.00        7    0.00
mas74             10482.80      42.52       12    1.19      1          1
mas76             38893.90      24.86       11    1.62
misc07             1415.00       0.00       31    0.00      1          1
mkc                -611.85       0.00      105    0.00      5         16
mod011        -62121982.55       0.00       16    0.00     13    1    16
modglob        20430947.62   69955.22       29    8.31
mzzv11           -22945.24       0.00      836    0.00     68    1   323
mzzv42z          -21623.00       0.00      676    0.00     48    1   278
net12                17.25      11.40      429    1.30     27   89  3115
noswot              -43.00       0.00       28    0.00      1          1
nsrand-ipx        48880.00       0.00       67    0.00     37    2    61
opt1217             -20.02       0.00       29    0.00      1          1
p2756              2688.75      10.20       30    2.00      3          4
pk1                   0.00       0.00       15    0.00
pp08a              2748.35     762.82       51   11.41
pp08aCUTS          5480.61     166.85       46    6.47      1          1
protfold            -41.96       0.00      449    0.00      7    1    34
qiu                -931.64       0.00       36    0.00      1    1     3
roll3000          11097.13       5.44      214    4.32      6    1    36
rout                981.86       2.34       35    1.00           1     1
set1ch            32007.73    3904.90      138   64.56      1          2
seymour             403.85       1.50      632    3.30     30    4   291
sp97ar        652560391.11  241502.97      194    2.00     89   12   522
swath               334.50       0.40       45    5.71      5    1    19
timtab1           28694.00  137970.93      136   16.46                 1
timtab2           83592.00  106311.17      233   27.02           1     4
tr12-30           14210.43   65484.48      348  322.01      1          8
vpm2                  9.89       0.48       31    7.41      1          1

4 Two manual examples of the presented technology

4.1 A trivial one 

$y_{1,1}: x_{1,1} + x_{1,2} \ge 1$ and $y_{1,2}: x_{1,2} + x_{1,3} \ge 1$ and $y_{1,3}: x_{1,3} + x_{1,1} \ge 1$
$y_{2,1}: x_{2,1} + x_{2,2} \ge 1$ and $y_{2,2}: x_{2,2} + x_{2,3} \ge 1$ and $y_{2,3}: x_{2,3} + x_{2,1} \ge 1$
Minimize $\omega(x) = x_{1,1} + x_{1,2} + x_{1,3} + x_{2,1} + x_{2,2} + x_{2,3}$,
where all variables have to be integer.

Clearly by just viewing the problem the optimal value of $\omega$ is 4. The
optimal solution of the LP itself is $x_{i,l} = 0.5$ for $i \in \{1;2\}$ and
$l \in \{1;2;3\}$ with objective 3. All dual variables $y_{i,l}$ also have the
value 0.5. The next step is to make the case differentiations. Let's
concentrate on $x_{i,1}$. Either $x_{i,1} = 0$ holds (case $j = 1$) or
$x_{i,1} \ge 1$ (case $j = 2$).

Calculating the 4 different LPs we get the following values for the dual 
variables and objectives: 



$i = 1, j = 1$:  $y_{1,1} = 1$,  $y_{1,2} = 0$,  $y_{1,3} = 1$,  $y_{2,l} = 0.5$,  $\omega = 3.5$
$i = 1, j = 2$:  $y_{1,1} = 0$,  $y_{1,2} = 1$,  $y_{1,3} = 0$,  $y_{2,l} = 0.5$,  $\omega = 3.5$
$i = 2, j = 1$:  $y_{2,1} = 1$,  $y_{2,2} = 0$,  $y_{2,3} = 1$,  $y_{1,l} = 0.5$,  $\omega = 3.5$
$i = 2, j = 2$:  $y_{2,1} = 0$,  $y_{2,2} = 1$,  $y_{2,3} = 0$,  $y_{1,l} = 0.5$,  $\omega = 3.5$


The reduced costs of the $x_{i,l}$ are not of interest because all variables
had no reduced costs in the LP-version of the problem.

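Following the definitions of the preceding main chapter we get for the
$\Delta^m_{i,j}(c_{i,j})$ the following values:

$i = 1, j = 1$:  $\Delta^{1,1}_{1,1} = -0.5$,  $\Delta^{1,2}_{1,1} = 0.5$,  $\Delta^{1,3}_{1,1} = -0.5$,  $\Delta^{2,l}_{1,1} = 0$,  $\omega_{1,1} = 0.5$
$i = 1, j = 2$:  $\Delta^{1,1}_{1,2} = 0.5$,  $\Delta^{1,2}_{1,2} = -0.5$,  $\Delta^{1,3}_{1,2} = 0.5$,  $\Delta^{2,l}_{1,2} = 0$,  $\omega_{1,2} = 0.5$
$i = 2, j = 1$:  $\Delta^{2,1}_{2,1} = -0.5$,  $\Delta^{2,2}_{2,1} = 0.5$,  $\Delta^{2,3}_{2,1} = -0.5$,  $\Delta^{1,l}_{2,1} = 0$,  $\omega_{2,1} = 0.5$
$i = 2, j = 2$:  $\Delta^{2,1}_{2,2} = 0.5$,  $\Delta^{2,2}_{2,2} = -0.5$,  $\Delta^{2,3}_{2,2} = 0.5$,  $\Delta^{1,l}_{2,2} = 0$,  $\omega_{2,2} = 0.5$

This gives the file values:

$i = 1$:  $\Delta^{1,1}_1 = 0.5$,  $\Delta^{1,2}_1 = 0.5$,  $\Delta^{1,3}_1 = 0.5$,  $\Delta^{2,l}_1 = 0$,  $\omega_1 = 0.5$
$i = 2$:  $\Delta^{2,1}_2 = 0.5$,  $\Delta^{2,2}_2 = 0.5$,  $\Delta^{2,3}_2 = 0.5$,  $\Delta^{1,l}_2 = 0$,  $\omega_2 = 0.5$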



So we reach the following LP for the combination of the files. 



Validity that $y_{1,1} \ge 0$: $0.5\lambda_1 + 0\lambda_2 \le 0.5$
Validity that $y_{1,2} \ge 0$: $0.5\lambda_1 + 0\lambda_2 \le 0.5$
Validity that $y_{1,3} \ge 0$: $0.5\lambda_1 + 0\lambda_2 \le 0.5$
Validity that $y_{2,1} \ge 0$: $0\lambda_1 + 0.5\lambda_2 \le 0.5$
Validity that $y_{2,2} \ge 0$: $0\lambda_1 + 0.5\lambda_2 \le 0.5$
Validity that $y_{2,3} \ge 0$: $0\lambda_1 + 0.5\lambda_2 \le 0.5$
The objective is $0.5\lambda_1 + 0.5\lambda_2$. As the optimal objective of
this problem is 1, we can derive that the lower limit of the original MILP is
at least 3 + 1 = 4. As there are solutions with this objective, this limit is
sharp.
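
The numbers of this trivial example can be re-checked with a short script. This is a sketch under two assumptions not stated in the text: the integer variables are restricted to the values 0, 1 and 2 for the brute-force search, and the combining LP is only evaluated at the point $\lambda_1 = \lambda_2 = 1$ rather than solved by an LP solver.

```python
from itertools import product

# Index pairs of the six inequalities x_a + x_b >= 1 (three per block).
PAIRS = [(0, 1), (1, 2), (2, 0), (3, 4), (4, 5), (5, 3)]

def feasible(x):
    return all(x[a] + x[b] >= 1 for a, b in PAIRS)

# Brute force over x in {0, 1, 2}^6 (assumed search range).
milp_opt = min(sum(x) for x in product(range(3), repeat=6) if feasible(x))

# Combining LP: row r reads sum_i lambda_i * Delta^r_i <= l_r.
delta = [[0.5, 0.5, 0.5, 0, 0, 0], [0, 0, 0, 0.5, 0.5, 0.5]]
omega = [0.5, 0.5]
l = [0.5] * 6
lam = [1, 1]
rows_ok = all(sum(lam[i] * delta[i][r] for i in range(2)) <= l[r] for r in range(6))
bound = 3 + sum(lam[i] * omega[i] for i in range(2))
```

Every row limits a single $\lambda_i$ to at most 1, so the evaluated point is also the optimum of the combining LP.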

This trivial example also gives the right idea that the files of case
differentiations in different, unconnected parts of the LP can be combined.



4.2 Almost a real one 

$x_1 + x_2 + x_3 \le 1$

$x_2 + x_3 + x_4 \le 1$
$x_3 + x_4 + x_5 \le 1$
$x_4 + x_5 + x_1 \le 1$
$x_5 + x_1 + x_2 \le 1$

Maximize $\omega(x) = x_1 + x_2 + x_3 + x_4 + x_5$, where all variables have to
be integer.

Let $i \in \{1;...;5\}$. The optimal dual solution of the LP is simply
$y_i = \frac{1}{3}$ with $\omega = \frac{5}{3}$. The optimal solution of the
MILP has an objective of 1. For example we branch on the two cases $x_1 = 0$
and $x_1 \ge 1$. We then get the following dual solutions:



$x_1 = 0$:  $y = (0, 0.5, 0.5, 0, 0.5)$ with $\omega = 1.5$
$x_1 = 1$:  $y = (1, 0, 0, 1, 0)$ with $\omega = 1$
$x_1 = 1$:  $y = (0.5, 0.25, 0.25, 0.5, 0.25)$ with $\omega = 1.5$ (Normalization!)



Naturally the theory can also be applied to maximum problems. So we get for
$\Delta_1$:

$$\Delta_1 = \left(\tfrac{1}{3}, \tfrac{1}{12}, \tfrac{1}{12}, \tfrac{1}{3}, \tfrac{1}{12}\right) \text{ with } \omega = \tfrac{1}{6}$$

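Using the symmetry of the problem we get for the combination of the files the
following LP:

$\frac{1}{3}\lambda_1 + \frac{1}{12}\lambda_2 + \frac{1}{3}\lambda_3 + \frac{1}{12}\lambda_4 + \frac{1}{12}\lambda_5 \le \frac{1}{3}$
$\frac{1}{12}\lambda_1 + \frac{1}{3}\lambda_2 + \frac{1}{12}\lambda_3 + \frac{1}{3}\lambda_4 + \frac{1}{12}\lambda_5 \le \frac{1}{3}$
$\frac{1}{12}\lambda_1 + \frac{1}{12}\lambda_2 + \frac{1}{3}\lambda_3 + \frac{1}{12}\lambda_4 + \frac{1}{3}\lambda_5 \le \frac{1}{3}$
$\frac{1}{3}\lambda_1 + \frac{1}{12}\lambda_2 + \frac{1}{12}\lambda_3 + \frac{1}{3}\lambda_4 + \frac{1}{12}\lambda_5 \le \frac{1}{3}$
$\frac{1}{12}\lambda_1 + \frac{1}{3}\lambda_2 + \frac{1}{12}\lambda_3 + \frac{1}{12}\lambda_4 + \frac{1}{3}\lambda_5 \le \frac{1}{3}$

With objective $\frac{1}{6}\lambda_1 + \frac{1}{6}\lambda_2 + \frac{1}{6}\lambda_3 + \frac{1}{6}\lambda_4 + \frac{1}{6}\lambda_5$.
The optimal solution is $\lambda_i = \frac{4}{11}$ and the objective is
$\frac{10}{33}$. So we have shown that the maximum in our original MILP is less
than or equal to $\frac{5}{3} - \frac{10}{33} = \frac{15}{11}$.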
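
The values of this combining LP can be verified with exact fractions and without an LP solver. The sketch below builds the coefficient matrix from $\Delta_1$ by cyclic shifts, checks the symmetric point $\lambda_i = \frac{4}{11}$, and certifies optimality with a matching dual solution; the uniform dual weights `u` are a choice made for this check, and the brute-force comparison assumes 0-1 variables.

```python
from fractions import Fraction as F
from itertools import product

a, b = F(1, 3), F(1, 12)
delta_1 = [a, b, b, a, b]          # file values for branching on x_1
# Column i is delta_1 shifted cyclically by i positions (symmetry of the problem).
A = [[delta_1[(m - i) % 5] for i in range(5)] for m in range(5)]
rhs = F(1, 3)                      # stock l_m = y_m = 1/3
gain = F(1, 6)                     # omega_i of every file

lam = [F(4, 11)] * 5
primal_ok = all(sum(A[m][i] * lam[i] for i in range(5)) <= rhs for m in range(5))
primal_value = gain * sum(lam)

# Dual certificate: u_m >= 0 with sum_m u_m * A[m][i] >= gain for every column i.
u = [F(2, 11)] * 5
dual_ok = all(sum(u[m] * A[m][i] for m in range(5)) >= gain for i in range(5))
dual_value = rhs * sum(u)

upper_bound = F(5, 3) - primal_value

def feasible(x):
    return all(x[k] + x[(k + 1) % 5] + x[(k + 2) % 5] <= 1 for k in range(5))

milp_opt = max(sum(x) for x in product((0, 1), repeat=5) if feasible(x))
```

Since the primal and the dual value coincide, the symmetric point is optimal for the combining LP; the resulting bound of 15/11 still lies above the true optimum of 1.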

At this point it is again possible to make a case differentiation on a;i = or 
xi > 1. If we assume a;i = the above LP would have the following form: 
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|Ai + j^X2 + ^Aa + ]^A4 + j2^^ — I 

-|Ai + ^3A2 + Y^As + ^3A4 + Y|A5 < ^ 

— ^Ai + 1^X2 + TjAa + Y|A4 + 3A5 < I 

|Ai + Y|A2 + ^Aa + ^3A4 + j^As < ^ 

-gAi + 3A2 + j^Xa + ^X4 + 3A5 < 3 

And for xi > 1: 

"I'^l + I^'^2 + IA3 + J^X4 + J^X^ < I 

|Ai + 3A2 + Y|A3 + 3A4 + Y|A5 < ^ 
"^•^1 + i^'^2 + 3A3 + Y|A4 + 3A5 < I 

|Al + Y|A2 + ^A3 + ^3A4 + j^X5 < I 

-3A1 + 3A2 + j^Xa + j^A4 + 3A5 < 3 

We'll stop the calculation at this point. We could now calculate some
$\Delta_i$ via checking the differences for the resulting normal variables
$\lambda_i$ and build a new LP, which would represent the possibility of
combining the changes of $\lambda$ when doing the case differentiations. By
this we would again reach a better upper limit for the original problem.



It is possible to iterate this method indefinitely, but some manual
calculations have shown that the real limit will never be reached in this way.

The author likes this example pretty much. It shows that non-trivial
combinations are possible, and that the method can be iterated in a surprising
way. But it also shows some limits. The above MILP is easily solved by doing
the 4 case differentiations on $x_1$ and $x_2$. Furthermore it is even possible
to make another case differentiation on one inequality. It is clear via the
first inequality that $x_1 = 1$ or $x_2 = 1$ or $x_3 = 1$ or
$x_1 = x_2 = x_3 = 0$ holds. Via this case differentiation it is seen most
quickly that the optimal value of the MILP is 1. The author thinks that such
case differentiations on inequalities should be investigated as an alternative
to the normal branching on one variable, especially in the 0-1 MILP-context.



5 Effectiveness for finding good dual values in 
the branching LPs 

5.1 Sidestep: Searching for integrity 

Only loosely connected to the rest of the paper we now investigate those MILPs
and the derived LPs which have a non-degenerate optimal solution space. As for
all branching investigations, and especially in this paper, the number of
non-integer variables which are supposed to be integer should be as small as
possible to reduce the running time of an implementation.
Therefore we assume that we have an optimal solution vector $x_0$. We just
freeze the objective function to the optimal value, so getting an additional
equality. We set additional bounds on all integer variables $x_n$ via
$[(x_0)_n] \le x_n \le [(x_0)_n] + 1$.
This is a valid definition also for MILPs which are not binary problems.
In general it might be useful to try out bounds like
$[(x_0)_n - \epsilon] \le x_n \le [(x_0)_n - \epsilon] + 1$. This gives some
integer variables more freedom to become non-integer, to allow other variables
to become more integer in the general MILP case.
Finally we now define for all integer variables $n$ the objective of the
minimization problem to encourage the variables to become integer.

$$c_n = 1 - 2([(x_0)_n] - (x_0)_n)$$

The new vector $x_1$ is now obtained by solving this LP to optimality. We now
calculate again new $c_n$ in the described manner so that we have an iterative
process. With this definition we have a good tool which gives almost-integer
variables a good motivation to become integer, while not preventing others
from giving up integrity in this process.

Notice that the choice of the $c_n$ was made by experiments. It cannot yet be
explained why this choice was superior to other approaches in the experiments.
Also the convergence of the method has only been investigated by experiments.
It is imaginable that reducing the number of non-integers might enhance the
quality of some heuristic cut findings, but the author has not obtained any
valuable result in his limited experiments. For the class of cuts of this paper
in chapter 6 this is certainly not relevant, as the cuts are only defined by
certain dual values and reduced costs.

Clearly this presented idea was motivated by the feasibility pump [2] to
generate integer solutions. We also present it here because the following
method was developed in the spirit of this easy algorithm.

5.2 Measurement of good dual values 

We will again concentrate on 2.7. Looking at the inequality there you see that
each file eats up certain inequalities (dual values) or variables (reduced costs).
So to find good values, you have to search for files, and hereby for dual variables,
which eat less of our stock but still give a good improvement in the objective
function. First we have to define what eating up the
stock of inequalities and variables means.

Suppose again you have made a case $j$ of case differentiation $i$ with a better dual
variable set. Some of the $\lambda^r_{i,j}$ might be negative, but when the file
is glued together by maximizing we suspect that the value $\lambda^r_i$ will be positive.
Anyhow, even if it is really negative, quite likely no other case differentiation
will need the negativeness. So we have argued to measure all negative $\lambda^r_{i,j}$ as
0. As a general approach to measure the distance to the starting point $y_0$ we can now
define:

$$D = \sum_r d_r \max(0, \lambda^r_{i,j})$$

We also need $D > 0$ to make the algorithm below work.

We leave out the problem of setting the $d_r$ values for now and first use this definition
to get better dual values. Therefore we create additional artificial variables $z_r$ in
the dual space. Via $z_r \ge 0$, $z_r \ge \lambda^r_{i,j}$ and $D = \sum_r d_r z_r$ we reflect the definition.
The idea is now to subtract a multiple of $D$ from the objective $\omega$ in such a way that the optimal
point of the LP created by the case $j$ of the case differentiation has the
same objective in our new artificial LP as $y_0$. Let $D_{i,j}$ be the distance of this
solution and $\omega_{i,j}$ the increase of the objective. Then we set:

$$\omega^{adj} = \omega - \frac{\omega_{i,j}}{D_{i,j}}\, D$$

So we have found an objective with the desired property.

Via our definitions we have assured that $D$ is always positive. When the new
LP is solved to optimality and the point $y^{new}_{i,j}$ is found, it is therefore clear that
this optimal point is in the polytope $P_{i,j}$. Furthermore, with $\omega^{new}_{i,j}$ and
$D^{new}_{i,j}$ denoting the objective increase and the distance of the new point, the following
can easily be proved:

$$\frac{\omega^{new}_{i,j}}{D^{new}_{i,j}} \ge \frac{\omega_{i,j}}{D_{i,j}}$$

So in terms of efficiency of eating up the stock the new point is better than or equal
to the first optimal point. When it is equal, the space $P_{i,j}$ might
often be one dimensional. But additionally, by our definition, we have not given up
the wish for a good objective gain.
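
As a small numerical illustration of this efficiency measure, the following Python sketch assumes that the distance is computed as $D = \sum_r d_r \max(0, \lambda_r)$, which is how the constraints $z_r \ge 0$, $z_r \ge \lambda_r$ are read here. The function names and all numbers are hypothetical and do not come from an actual case differentiation.

```python
def stock_distance(lam, d):
    """D = sum_r d_r * max(0, lam_r); negative lam_r are measured as 0."""
    return sum(d_r * max(0.0, lam_r) for lam_r, d_r in zip(lam, d))

def efficiency(gain, lam, d):
    """Objective gain per unit of consumed stock (requires D > 0)."""
    dist = stock_distance(lam, d)
    if dist <= 0:
        raise ValueError("D must be positive")
    return gain / dist

d = [1.0, 1.0, 2.0]                              # hypothetical weights d_r
first = efficiency(0.5, [0.5, -0.25, 0.25], d)   # D = 0.5 + 0 + 0.5 = 1.0
new = efficiency(0.5, [0.25, 0.0, 0.125], d)     # D = 0.25 + 0 + 0.25 = 0.5
```

Here the second point reaches the same gain while eating only half of the stock, so its efficiency is twice as high.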

This trick can also easily be iterated, as is already visible from our definitions.
The algorithm can easily be enhanced so that it works on finding better values of
files, but this generalization will not be presented here. Also, in an implementation
you could use the values of the already manipulated cases of the case
differentiation. When an inequality or a variable has already been eaten up a bit, the
new case should have this meal for free.



We now have had some fun with preparing efficient meals of inequalities and
variables, but one crucial point of the recipe is still open: the definition of the
$d_r$. If you define all $d_r = 1$, then the big $l_r$ in 2.7 will get too much attention.
Tiny $l_r$, which might always hinder the combining of the files, are overlooked.
So the natural choice is to set $d_r = \frac{1}{l_r}$, which gives all non-null inequalities
and all variables with real reduced costs the same weight. Sadly this theoretically
good approach led to numerical problems in the author's implementations.
Often the manipulated LP was badly conditioned. So the author suggests using
a lower limit like 0.01 for all $d_r$.
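
The weight rule can be written down directly. The sketch below assumes that the $l_r$ are non-negative and that null entries simply receive no weight; the function name `stock_weights` and the sample values are hypothetical.

```python
def stock_weights(l, floor=0.01):
    """Weights d_r = 1 / l_r, bounded from below by `floor`.

    Entries with l_r == 0 (null inequalities, zero reduced costs) get None,
    because they are not part of the stock. Assumes l_r >= 0.
    """
    weights = []
    for l_r in l:
        if l_r == 0:
            weights.append(None)
        else:
            weights.append(max(1.0 / l_r, floor))
    return weights

weights = stock_weights([2.0, 0.0, 500.0])
```

For the last entry the raw weight 1/500 would fall below the limit, so 0.01 is used instead.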
The author has implemented the above algorithm partially, but with some
disappointment. He did not manage to reuse the old optimal solution in the
software package glpk, so each manipulated LP had to be solved from scratch.
This led to too long running times. He also thinks that this time increase is
only partially caused by missing features of the software used. Using the
theoretically well-reasoned approach of this section might just be too numerically
complex because of the sheer number of added constraints.

5.3 Finding quickly the dual values 

In the preceding subsection we described a theory to find dual solutions with
a good objective which can be considered as near to the basic dual solution
$y_0$. We did this by introducing variables which measure the distance to the
original. Another approach for finding good dual solutions, and so files, is to use
additional inequalities. Depending on the aim this can result in files which fit
better to each other or in dual solutions which can be calculated very quickly.
Suppose you have already made a branching with a file $f_1$. Then to combine a
second branching with the first you just demand:

Speaking in terms of dual inequalities you get lower bounds for those variables
$y_m$ which were nonzero in the basic dual solution. Furthermore, restrictive
dual inequalities which weren't in the solution vector $y_0$ become in general
more restrictive. The benefit of this approach is that, using those additional
restrictions, it is clear that the two branchings can be fully combined. In terms of
2.7 this means that $\lambda_1 = \lambda_2 = 1$.

Naturally the idea can easily be iterated via demanding: 

i 

Let's do another sidestep at this point. Suppose that both files consist of two
cases, which is the normal case for MILPs. You have done the case differentiations
to combine these two cases, so you have solved 2+2 = 4 calculations. But
doing instead a case differentiation on the 4 cases (2*2 = 4), which already
enumerate all possible combinations, you would have the same calculation time.
And you will have at least as good an objective increase with these 4 cases as with
combining the two case differentiations. So for a clever implementation of these
sketched algorithms the principle of parallel branching should not be followed too
strictly. Doing all case differentiations on $p$ cases might be interesting when $2^p$
is still comparable to $2p$.
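
The trade-off between the two counts is quickly tabulated. The following Python sketch assumes that $p$ denotes the number of binary case differentiations, so that combining files costs $2p$ LP solves and full enumeration costs $2^p$; the function name `lp_solves` is hypothetical.

```python
def lp_solves(p):
    """LP solves for p binary case differentiations.

    Returns (combined, enumerated): 2*p solves when the files are combined,
    2**p solves when all combinations of cases are enumerated.
    """
    return 2 * p, 2 ** p

table = {p: lp_solves(p) for p in range(1, 6)}
# table[2] is (4, 4): both approaches cost the same for two differentiations.
```

For p = 3 the counts are still close (6 against 8), while for p = 5 enumeration already needs 32 solves against 10.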

But let's get back to additional restrictions for the dual inequalities. We start
with a metaphor: linear equations describe nature. When a butterfly flies
up in Brazil, the emerging circulations won't normally influence the weather in
Europe. Speaking in terms of LP, the introduction of a new variable in a dual
inequality often only has an effect on those inequalities which are strongly bound
to the related inequalities. So it is a striking thought to define a neighborhood of a new
variable and to freeze all other variables which are not in the neighborhood.
This should result in a big reduction of the running time. If we have
good criteria for the neighborhood, the objective increase will often be comparable
to the objective increase of the new dual inequality without freezing. So
it is quite likely that the resulting files might also be effective in terms of the
last subsection.

Clearly defining neighborhood by the graph of the inequality system or other 
means is a complex story. The definition of the neighborhood should also be 
dependent on the type of the MILP. 
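
One simple candidate for a graph-based neighborhood is a breadth-first search in which two variables count as adjacent when they share an inequality. The Python sketch below is only one possible definition, not the one used by the author; the function name, the radius parameter and the toy system are hypothetical.

```python
from collections import deque

def neighborhood(rows, start, radius):
    """Variables reachable from `start` within `radius` shared-inequality steps.

    rows: list of sets; rows[r] holds the indices of the variables that
    have a nonzero coefficient in inequality r.
    """
    by_var = {}
    for r, cols in enumerate(rows):
        for v in cols:
            by_var.setdefault(v, []).append(r)
    dist = {start: 0}
    queue = deque([start])
    while queue:
        v = queue.popleft()
        if dist[v] == radius:
            continue  # do not expand beyond the radius
        for r in by_var.get(v, []):
            for w in rows[r]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
    return set(dist)

# Toy system: a chain 0-1-2-3 and a separate block {4, 5}.
rows = [{0, 1}, {1, 2}, {2, 3}, {4, 5}]
near = neighborhood(rows, 0, 2)
frozen = set(range(6)) - near  # variables that would be frozen
```

A weighted variant, for example using coefficient sizes, would fit the remark that the definition should depend on the type of the MILP.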

Suppose you have a good neighborhood definition. Then the technique of freezing
most of the dual variables might also be an alternative to the strong branching
method, which determines in a branch and cut framework the next variable
to branch on. Strong branching relies on a good and steep implementation
of the dual simplex, where you use the values of the objective after only a few
iterations of the Simplex algorithm. It should be noticed that such a steep dual
Simplex algorithm is not a prerequisite of the algorithm presented here. So this approach can
be used in less advanced packages like glpk to do something similar.

This chapter could be described as visionary or even dreamy; anyhow, the subject
of this paper is to present the author's ideas on the subject. To make it
complete the author just had to add it, otherwise he would always think that his
ideas had not been presented decently.

6 Application of the theory to produce cuts for 
the original MILP 

When thinking of building the lookahead term of the concurrent branching
into an existing branch and cut framework, the dual combining inequality of 2.7
doesn't fit easily. It is striking that, instead of that additional dual LP, you
would like to just have more restrictions in normal space. Generation of
cuts should be the aim. We'll see at the end of this chapter that this is possible.

We start at the point that we have an optimal dual solution $y_0$ with only
one file $f_1$, which describes a case differentiation. We now use a new special
form of 2.9: we freeze the $f_1$ as linear factors of a scalar $\lambda$. Contrary to the
special form 2.7 we let $y$ really play the role of dual variables and do not fix it to
$l_r$-values. So we have as variables the vector $y$ and the scalar $\lambda$. Transformed
back via dual-dual correspondence this will give us more or less the normal
inequalities and equalities plus one additional inequality, which will be our cut. But
let's stick to the details. We have the following dual inequalities for this special
model:



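Definition 6.1

$$y_m - \lambda\,\lambda^m(f_1) \ge 0 \quad \forall m,$$

$$\sum_m c_{m,n}\, y_m + \lambda\,\lambda^n(f_1) \le c_n \quad \forall n,$$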




Where (n) goes over all normal variables, (m) over all dual variables
(inequalities + equalities), $c_{m,n}$ are the matrix coefficients of the LP and $c_n$ are the
coefficients of the normal objective. The objective function of this dual problem
is $\sum_m b_m y_m + \lambda\,\omega(f_1)$, where the $b_m$ are the right hand sides of the inequalities
and equalities.

This dual problem can be transformed to normal space: 
Remark 6.2 



The objective is just $\sum_n c_n x_n$ as the normal objective. This already looks
interesting, but prior to the final transformation to get a cut we must first prove
that this system is valid for all integer values. Sadly the proof is very indirect;
a direct proof was not discovered by the author.

Before doing the proof we must first study the reuse of files for other basic
solutions than the starting one. In 2.7 we had some files which we tried to
add to some basic solution. If we had used another solution with
other $l_r$-values, we could naturally use the methodology as well. Adding the files
might still be possible. The only thing which might happen is that all $\lambda_i$ in 2.7
have to be 0. The same holds if we have a more strict LP. Then the dual solution
only has some more variables, but the original ones are still there.

Remark 6.3 The lookup term via combining files can still be applied to a more
strict version of the starting normal LP. For incompatible problems it can only
be defined when, for the dual variables (m) missing in the LP to which the file should
be applied, the $\lambda^m$ are negative or 0.

This remark also clarifies the usage of the lookup terms for integration in
branch and cut frameworks. The file info of an LP remains valid for all its descendants
and is normally invalid for other descendants of the root LP.
Consider you have an integer solution of the LP. Then it is clear that this integer
solution is the only optimal solution of a version of the LP which has been made
more strict via adding more inequalities. This more restrictive LP is represented
in the dual space by a loosened LP. We can still try to add our file in the dual
space. Via this we get the special model 6.1 for the loosened dual inequality.




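$$y_m:\quad \sum_n c_{m,n}\, x_n - x^m \ge b_m$$

$$\lambda:\quad \sum_m \lambda^m(f_1)\, x^m + \sum_n \lambda^n(f_1)\, x_n \ge \omega(f_1)$$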
For this model the optimal objective has to be identical to that of the dual LP and the
normal LP. Otherwise we would prove that the optimal integer solution of the
more strict LP has to have a bigger objective than the already existing integer
solution, which is a contradiction.

The optimal dual solution of our loosened LP 6.1 in the dual space is a solution
of a more restrictive LP 6.2 than the original one. So we have found an optimal
normal solution which also fulfills the more restrictive inequalities. As we
had said that the original solution was the only optimal solution, it must be
identical to the new one. So the original solution has to fulfill 6.2.
So all integer solutions fulfill it.

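As a final step we can state that the normal inequality system is equivalent
to:

$$y_m:\quad \sum_n c_{m,n}\, x_n \ge b_m$$

$$\lambda:\quad \sum_m \lambda^m(f_1) \Big( \sum_n c_{m,n}\, x_n - b_m \Big) + \sum_n \lambda^n(f_1)\, x_n \ge \omega(f_1)$$

Or written with slack variables $s_m = \sum_n c_{m,n}\, x_n - b_m$:

$$y_m:\quad \sum_n c_{m,n}\, x_n = b_m + s_m$$

$$\lambda:\quad \sum_m \lambda^m(f_1)\, s_m + \sum_n \lambda^n(f_1)\, x_n \ge \omega(f_1)$$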

The cut $\lambda$ in its last form is surprisingly short, and that's where we aimed
to go. The dual LP 6.1 has an increase of the value of the optimal
solution of at least $\omega(f_1)$. So the same holds for its dual, which is equivalent to the last
inequalities.

Theorem 6.4 (Generation of branching cut) The following inequality is true
for all integer solutions:

$$\lambda:\quad \sum_m \lambda^m(f_1)\, s_m + \sum_n \lambda^n(f_1)\, x_n \ge \omega(f_1)$$

The objective increase by adding one cut of this kind is at least $\omega(f_1)$.
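
Evaluating such a cut for a given point only needs the slacks and the file coefficients. The Python sketch below computes the left-hand side $\sum_m \lambda^m s_m + \sum_n \lambda^n x_n$ with $s_m = \sum_n c_{m,n} x_n - b_m$. The function name and all numbers are hypothetical; in particular the coefficients are made up for illustration and are not derived from a real case differentiation, so nothing is claimed about the validity of this particular toy cut.

```python
from fractions import Fraction as F

def branching_cut_lhs(C, b, x, lam_m, lam_n):
    """Left-hand side sum_m lam_m*s_m + sum_n lam_n*x_n of the cut.

    The slacks are s_m = sum_n C[m][n]*x_n - b[m].
    """
    slacks = [sum(c * xn for c, xn in zip(row, x)) - bm
              for row, bm in zip(C, b)]
    return (sum(l * s for l, s in zip(lam_m, slacks))
            + sum(l * xn for l, xn in zip(lam_n, x)))

# Hypothetical toy data: two rows, two variables.
C = [[1, 1], [1, 0]]
b = [1, 0]
lam_m = [F(1, 2), F(0)]   # made-up file values for the rows
lam_n = [F(0), F(1, 2)]   # made-up file values for the variables
omega = F(1, 2)           # made-up objective gain of the file

lhs_frac = branching_cut_lhs(C, b, [F(1, 2), F(1, 2)], lam_m, lam_n)
lhs_int = branching_cut_lhs(C, b, [F(0), F(1)], lam_m, lam_n)
violated = lhs_frac < omega   # the fractional point would be cut off
```

Exact fractions are used so that the comparison with the right-hand side is not disturbed by rounding.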

6.1 Thoughts about the new cuts 

First we apply this cut to the problem in 4.2 and we get:

11 1111 

— Si H So H s-i H — sa H sk > — 

3 12 12 -^ 3 12 ^ - 6 

9 6 6 6 6 11 5 

12^^ + 12^^ + 12"^ + 12^^ + 12"^ - 12 - 6 






3 11113 

When applying = as case in the original problem, you get the solution 
vector (0, ^, I;, |, 5). This solution is equahzing the above cut. And for the 
other case $x_1 = 1$ with solution vector (1, 0, 0, 0, 0), the cut is also sharp.

Had we chosen the file without normalization, we would have got:

3 

Xl + X2 + Xz + Xi + x^ < - 

At this cut the solution vector of the case $x_1 = 0$ is equalizing the cut, but
(1,0,0,0,0) is not. This could easily be investigated more abstractly. Anyhow we
state that normalization leads to sharper cuts, which is true in general.

Remark 6.5 The defined class of cuts is sharp, in the sense that it can
be used to get a proof that an integer solution is the optimal one.

For binary problems this is not difficult to understand. Just make a case
differentiation over all cases. As we have a binary problem this is a finite number;
then this one derived cut is already sufficient. In general you have to argue a
bit more cleverly, but the remark is still true.

The above statement has no practical implication, as by making the case
differentiation you already had a proof.

Also, the derived cut will be similar to the objective, as above, where the left-hand side
$x_1 + x_2 + x_3 + x_4 + x_5$ of the cut was already the objective function, but the
right-hand side was not the real optimal integer objective value.

We will now get a little philosophical. Consider you want to make a proof
that an integer solution with objective $\omega_0$ is an optimal one. By a big case
differentiation you can produce one single cut, so that the best integer solution has
to be almost $\omega_0$. But the cut is already very similar to the objective function.
So if you make a simple case differentiation after adding the cut, the objective
will not increase in any branch at all. Thus the big mighty cut is irrelevant for
the proof. This suggests the expectation below:

Remark 6.6 Many easy little steps are better than a few big complex steps. 

If you analyse the proof of the validity of the cut, things like parallelism of
branching are not used at all. This could lead to the wrong conclusion that the
whole dual theory of concurrent branching is redundant. The produced cuts in
normal space yield at least the same objective increase as the improvement by the use
of files at 2.6 in the dual space. This has not been shown explicitly here, but it
is easily understood when you change the starting model in 6.1 so that it uses more
than one file.

In the dual space you can calculate which files to use, and measure the files and so
the cuts. In dual space you just have better control of what you do.
It would be like seeing only the tip of an iceberg if the dual theory had not
been included here. And last but not least, the author first developed the dual
theory for his idea of concurrent branching. Based on this idea he discovered
that it might be applied again to normal space.

What's left to be done? 

The philosophical statement should be backed up by some examples.
An implementation of the ideas to produce cuts or lookahead terms very quickly
should be done, to really measure the usefulness of the theory. Particularly for
binary problems it would be interesting to generate cuts on inequalities or equations
with the discussed technique of fixing most of the dual variables.
Furthermore it is most interesting to classify other cut generation algorithms
in our terms or vice versa. Also, applying the sketched idea of effectiveness of
lookahead terms, and so cuts, to other cut classes might be a fruitful idea.